Enhanced Byte Codes with Restricted Prefix Properties

Authors

  • J. Shane Culpepper
  • Alistair Moffat
Abstract

Byte codes have a number of properties that make them attractive for practical compression systems: they are relatively easy to construct; they decode quickly; and they can be searched using standard byte-aligned string matching techniques. In this paper we describe a new type of byte code in which the first byte of each codeword completely specifies the number of bytes that comprise the suffix of the codeword. Our mechanism gives more flexible coding than previous constrained byte codes, and hence better compression. The structure of the code also suggests a heuristic approximation that allows savings to be made in the prelude that describes the code. We present experimental results that compare our new method with previous approaches to byte coding, in terms of both compression effectiveness and decoding throughput speeds.
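To make the central property concrete, here is a minimal sketch of a byte code in which the first byte alone determines how many suffix bytes follow. The parameterization is an assumption made for illustration and is not taken from the abstract: a hypothetical tuple `v` gives, for each suffix length 0, 1, 2, ..., how many of the 256 possible first-byte values are assigned to that length. The function names `encode` and `decode` are likewise hypothetical, and symbols are represented by their integer rank (0 for the most frequent symbol).

```python
from typing import List, Sequence

RADIX = 256  # number of distinct byte values


def encode(rank: int, v: Sequence[int]) -> bytes:
    """Encode one symbol rank; v[s] first-byte values are reserved for s suffix bytes."""
    assert sum(v) <= RADIX, "first-byte values are a shared budget"
    first = 0  # first-byte value at which the current length class starts
    for suffix_len, count in enumerate(v):
        capacity = count * RADIX ** suffix_len  # symbols this class can hold
        if rank < capacity:
            prefix, rest = divmod(rank, RADIX ** suffix_len)
            return bytes([first + prefix]) + rest.to_bytes(suffix_len, "big")
        rank -= capacity
        first += count
    raise ValueError("rank exceeds the capacity of this code")


def decode(data: bytes, v: Sequence[int]) -> List[int]:
    """Decode a stream of codewords back into symbol ranks."""
    assert sum(v) <= RADIX
    # Lookup table: first byte -> (suffix length, rank of that prefix with a zero suffix)
    table = []
    base = 0
    for suffix_len, count in enumerate(v):
        for _ in range(count):
            table.append((suffix_len, base))
            base += RADIX ** suffix_len
    ranks = []
    i = 0
    while i < len(data):
        suffix_len, base = table[data[i]]  # the first byte fixes the codeword length
        suffix = int.from_bytes(data[i + 1 : i + 1 + suffix_len], "big")
        ranks.append(base + suffix)
        i += 1 + suffix_len
    return ranks


if __name__ == "__main__":
    v = (200, 50, 5, 1)  # hypothetical split of the 256 first-byte values
    ranks = [0, 199, 200, 13000, 400000]
    stream = b"".join(encode(r, v) for r in ranks)
    print(decode(stream, v) == ranks)
```

Under this assumed layout, shifting first-byte values between length classes trades short codewords for frequent symbols against total capacity, which is one way to read the abstract's claim of more flexible coding. The decoder needs only a 256-entry table indexed by the first byte.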

Similar resources

Variable-length Splittable Codes with Multiple Delimiters

Variable-length splittable codes are derived from encoding sequences of ordered integer pairs, where one of the pair’s components is upper bounded by some constant, and the other one is any positive integer. Each pair is encoded by the concatenation of two fixed independent prefix encoding functions applied to the corresponding components of a pair. The codeword of such a sequence of pairs cons...

On cardinality of network subspace codes

We analyze properties of different subspace network codes. Our study includes Silva-Koetter-Kschischang codes (SKK-codes), multicomponent codes with zero prefix (Gabidulin-Bossert codes), codes based on combinatorial block designs, Etzion-Silberstein codes (E-S codes) based on Ferrers diagrams, and codes which use a greedy search algorithm and restricted rank codes. We calculate cardinality values...

Phrase-Based Pattern Matching in Compressed Text

Byte codes are a practical alternative to the traditional bit-oriented compression approaches when large alphabets are being used, and trade away a small amount of compression effectiveness for a relatively large gain in decoding efficiency. Byte codes also have the advantage of being searchable using standard string matching techniques. Here we describe methods for searching in byte-coded comp...

Decoding prefix codes

Minimum-redundancy prefix codes have been a mainstay of research and commercial compression systems since their discovery by David Huffman more than 50 years ago. In this experimental evaluation we compare techniques for decoding minimum-redundancy codes, and quantify the relative benefits of recently developed restricted codes that are designed to accelerate the decoding process. We find that ...

Relativized codes

A code C over an alphabet Σ is a set of words such that every word in C⁺ has a unique factorization over C, that is, a unique C-decoding. When not all words in C⁺ appear as messages, a weaker notion of unique factorization can be used. Thus we consider codes C relative to a given set of messages L, such that each word in L has a unique C-decoding. We extend this idea of relativizing code concep...

Publication date: 2005